Internet Info 1997 December

Received: (from daemon@localhost) by services.bunyip.com (8.6.10/8.6.9) id PAA28934 for urn-ietf-out; Mon, 21 Oct 1996 15:59:43 -0400
Received: from mocha.bunyip.com (mocha.Bunyip.Com [192.197.208.1]) by services.bunyip.com (8.6.10/8.6.9) with SMTP id PAA28927 for <urn-ietf@services.bunyip.com>; Mon, 21 Oct 1996 15:59:40 -0400
Received: from windrose.omaha.ne.us by mocha.bunyip.com with SMTP (5.65a/IDA-1.4.2b/CC-Guru-2b) id AA23206 (mail destined for urn-ietf@services.bunyip.com); Mon, 21 Oct 96 15:59:38 -0400
Message-Id: <9610211959.AA23206@mocha.bunyip.com>
Received: by privateer.windrose.omaha.ne.us; Mon Oct 21 15:01 CDT 1996
From: "Ryan Moats" <jayhawk@ds.internic.net>
To: "Martin J Duerst" <mduerst@ifi.unizh.ch>, "Patrik Faltstrom" <paf@swip.net>
Cc: "urn-ietf@bunyip.com" <urn-ietf@bunyip.com>
Date: Mon, 21 Oct 96 15:04:20
Priority: Normal
X-Mailer: PMMail 1.52 For OS/2 UNREGISTERED SHAREWARE
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Subject: Re: [URN] Pre release of URN Syntax document....
Sender: owner-urn-ietf@services.bunyip.com
Precedence: bulk
Reply-To: "Ryan Moats" <jayhawk@ds.internic.net>
Errors-To: owner-urn-ietf@bunyip.com

On Mon, 21 Oct 1996 20:36:58 +0200 (MET DST), Patrik Faltstrom wrote:
>
>On Mon, 21 Oct 1996, Martin J Duerst wrote:
>
>> The other big problem is equivalence. For Unicode, character equivalence
>> is in some cases not the same as codepoint equivalence. Character
>> equivalence is well defined, but there is currently no standard
>> for normalization.
>
>What exists are the decomposition rules for Unicode. Those are well defined.

True, but I should probably add some stuff to section 3 about this.

>> This does not concern things such as case,
>> where the user can easily distinguish lower case and upper case,
>> but cases such as A-with-Grave, which can be encoded both as
>> one single codepoint and as the sequence A, Grave.
>
>The casing is also defined in the Unicode spec.
>
>What you do is to first decompose the string (i.e. change
>A-with-Grave into A, Grave), and then do the case insensitive
>comparison (if you want to).
>
>All of this is implemented in the Digger Whois++ server from
>Bunyip. It is doable!

Ah, yes, but in the server, rather than in the client...

>> Another problem that should be considered are bidirectionality
>> for Hebrew and Arabic.
>
>It should be defined, just like in MIME, that characters are
>stored in the order they are stored, not displayed. (At least
>I remember that we discussed this with display order in the
>MIME community in an interesting meeting faaaar back in time,
>which was ended by looking at some example where two Hebrew
>users, using the same computer, was using different display
>order...).

I agree with Patrik. Regardless of glyph direction order, NSS octets
are in order of presentation, i.e. for left->right, leftmost character
first; for right->left, rightmost character first; etc.

>> It would be a good idea if URN resolvers could also care about
>> Unicode character equivalence. But maybe some schemes would do
>> that easier than others?
>
>It is the resolution service that have to care about character
>equivalence. The URN spec should only say UTF-8 from my point of
>view. It could though be noted that A-with-grave and A, Grave in
>Unicode is regarded as an example of equivalent characters.

I agree with Patrik again. I have added the note.

Ryan
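
The decompose-then-compare approach Patrik describes can be illustrated with a minimal sketch. It assumes Python's standard `unicodedata` module, with its NFD form standing in for the Unicode decomposition rules mentioned in the thread. The function name `equivalent` and the sample strings are hypothetical and are not taken from the URN syntax draft or from the Digger server.

```python
import unicodedata


def equivalent(a: str, b: str, ignore_case: bool = False) -> bool:
    """Compare two strings by character equivalence, not codepoint equality.

    Both strings are canonically decomposed first (A-with-Grave becomes
    A, Grave). If ignore_case is set, they are then case-folded and
    decomposed once more, since folding can yield composed characters.
    """
    a_n = unicodedata.normalize("NFD", a)
    b_n = unicodedata.normalize("NFD", b)
    if ignore_case:
        a_n = unicodedata.normalize("NFD", a_n.casefold())
        b_n = unicodedata.normalize("NFD", b_n.casefold())
    return a_n == b_n


# Hypothetical sample data: the same character in two encodings.
precomposed = "\u00C0"   # LATIN CAPITAL LETTER A WITH GRAVE
decomposed = "A\u0300"   # LATIN CAPITAL LETTER A + COMBINING GRAVE ACCENT

# The UTF-8 octets differ even though the characters are equivalent.
precomposed_octets = precomposed.encode("utf-8")   # b"\xc3\x80"
decomposed_octets = decomposed.encode("utf-8")     # b"A\xcc\x80"

if __name__ == "__main__":
    print(precomposed_octets != decomposed_octets)
    print(equivalent(precomposed, decomposed))
    print(equivalent("\u00E0", decomposed, ignore_case=True))
```

A byte-for-byte comparison of the UTF-8 octets would treat the two forms as different names, which is why the thread assigns equivalence handling to the resolution service rather than to the URN syntax itself.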